Abstract

Background: All Shigella flexneri serotypes except serotype 6 share a common O-antigen tetrasaccharide backbone,
and nearly all variation between serotypes is due to glucosyl and/or O-acetyl modifications of the common O unit
mediated by glycosyltransferases encoded by serotype-converting bacteriophages. Several S. flexneri serotype-converting
phages, including SfV, SfX, Sf6 and SfII, have been isolated and characterized. However, the S. flexneri serotype-converting
phage SfI, which encodes the type I modification of serotype 1 (1a, 1b, 1c and 1d), had not yet been characterized.

Results: The SfI phage was induced and purified from S. flexneri serotype 1a clinical strain 019. Electron microscopy
showed that SfI has a hexagonal head and a long contractile tail, characteristic of members of the Myoviridae
family. SfI can convert serotype Y to serotype 1a and serotype X to serotype 1d, but cannot convert the 10 other S. flexneri
serotypes tested (1a, 1b, 2a, 2b, 3a, 3b, 4a, 4b, 5a, Xv), suggesting that SfI has a narrow host range. Like other
S. flexneri serotype-converting phages, SfI integrates into the tRNA-thrW gene adjacent to proA on the host chromosome
when lysogenized. The complete SfI genome sequence is 38,389 bp long and encodes 66 open reading frames and
two tRNA genes. Phage SfI shares significant homology with S. flexneri phage SfV, Escherichia coli prophage e14 and
lambda, and is classified into the lambdoid phage family. SfI was found to use a cos mechanism for DNA packaging
similar to that of phage SfV.

Conclusions: SfI contains features of lambdoid phages and is closely related to S. flexneri phage SfV, E. coli prophage e14
and lambda. The characterization of SfI enhances our understanding of serotype conversion of S. flexneri.



Background 

Shigella is the major cause of endemic bacillary dysentery
(shigellosis) in developing countries. It is estimated that
there are about 164.7 million cases of shigellosis annually
worldwide, of which 163.2 million occur in developing
countries, resulting in 1.1 million deaths, most of them
in children under 5 years of age [1]. Among the four
Shigella species, S. dysenteriae, S. flexneri, S. boydii and
S. sonnei, S. flexneri is the predominant species.

Based on the combination of antigenic determinants
present in the O-antigen of the cell envelope
lipopolysaccharide (LPS), S. flexneri is further divided
into various serotypes. To date, at least 16 serotypes
have been recognized [2-4]. Except for serotype 6, all
share a basic repeating tetrasaccharide unit composed
of one GlcNAc and three rhamnoses [4]. Modifications
to the side chain of the tetrasaccharide by the addition
of glucosyl and/or O-acetyl groups give rise to various
antigenic determinants [3]. The genes responsible for
the O-antigen modification are always either the gene
cluster gtrABC for glucosyl groups or the single oac gene
for the O-acetyl group, all encoded by serotype-converting
bacteriophages [3,5-10]. In all glucosylation
modification phages, the gtrABC gene cluster is always
located immediately upstream of the attP site, followed
by the int and xis genes [6].

Up to now, four S. flexneri serotype-converting
bacteriophages, SfV, SfX, Sf6 and SfII, have been induced
and purified by different groups [8,11-13]. Morphologically,
SfV and SfII, which have an isometric head and a
long tail, belong to Group A in the family Myoviridae
[8,11], while SfX and Sf6, which possess a short tail
linked to an isometric head, belong to the family
Podoviridae [12,13]. The complete genome sequences of
phages SfV and Sf6 have been obtained by directly
sequencing the phage DNA purified from phage particles,
and their genetic features have been well characterized
[9,10]. Recently, the prophage genome of SfX was
determined from the sequenced S. flexneri serotype Xv
strain 2002017; it is presumably the whole genome
of phage SfX, because an SfX phage particle can be
induced and isolated from 2002017 [2]. The SfX genome
is 37,355 bp in length and encodes 59 ORFs (unpublished
data). The genome of SfII has not yet been sequenced
from free phage particles, but prophage genomes can be
derived from the sequenced S. flexneri serotype 2a strains
Sf301 and 2457T [14,15], which show considerable
variation, with one or both being prophage remnants.

S. flexneri serotype 1 is defined by reaction with type I
antisera. A total of 4 subtypes, 1a, 1b, 1c and 1d, have
been recognized [16-18]. In serotype 1, a glucosyl group
is attached to the GlcNAc residue of the repeating unit
by an alpha-1,4 linkage, which results in the presence of
the serotype 1-specific I antigen. The type I modification is
mediated by an O-antigen glucosylation locus (gtrI, gtrA,
gtrB) encoded on the SfI prophage genome [5]. The
glucosylation genes and flanking partial SfI sequences
were previously obtained from serotype 1a strain Y53
[17]. However, the free phage particle of SfI had not
been isolated, and its full genomic characteristics had
not yet been elucidated [5].

In this study, we induced and purified free SfI
phage particles from S. flexneri serotype 1a clinical strain
019 and characterized the phage's morphology, host range and
genomic features.

Results and discussion 

Isolation of phage SfI from S. flexneri serotype 1a strain
019

Using the conditions described in Methods, we induced
the SfI phage from serotype 1a strain 019. Plaques were
observed on semi-solid LB agar when the host strain
036 was infected with induced products from strain 019.
Lysogens isolated from plaques were serologically identified
as serotype 1a, characterized by agglutination with
both typing sera I and grouping sera 3;4. PCR amplification
indicated that the SfI-specific gene gtrI is present in
both the phage particles and the lysogens. These results
suggest that phage SfI was successfully induced and
isolated from strain 019. This is the first report of isolation
of free SfI particles from S. flexneri.



The morphology of SfI is characteristic of the Myoviridae
family

The purified SfI phage particles were morphologically
analyzed using electron microscopy. The phage has a
hexagonal head of ca. 55 nm in diameter, a knob-like neck, a
contractile tail of ca. 110 nm, and a tail sheath of ca.
55 nm (Figure 1). There are indications of a baseplate-like
structure and long tail fibers, but no other distinctive
features could be seen (Figure 1). These characteristics
suggest that phage SfI is a member of the Myoviridae
family in the order Caudovirales [19].

Compared with the other morphologically characterized
serotype-converting phages Sf6, SfV, SfII and SfX, SfI
is very similar in appearance to SfII and SfV [8,11], but
distinct from SfX and Sf6 [12,20]. The microscopic
difference reflects the genetic divergence among them,
in that the SfI packaging and structure genes are
identical to those of phage SfV, but divergent from those of
SfX and Sf6 (see below, Figure 2).

Figure 1 Electron micrograph of S. flexneri bacteriophage SfI stained with phosphotungstic acid.

Figure 2 Genetic map of S. flexneri bacteriophage SfI and comparison of SfI with related phages and prophages. The SfI genome is
shown to scale. Numbers below the scale bar are the number of base pairs. Arrows above the scale represent the predicted genes and
their orientation. ORFs are shown within the boxes, and putative function modules and names of genes are given above. Vertical black lines indicate
the cos and attP sites, respectively. Red arrows indicate tRNA genes. Pseudogenes are marked with a black asterisk. Below the scale, arrows represent
homologous proteins of bacteriophages and prophages from different hosts, with S. flexneri, E. coli and Salmonella framed within a green, red or
blue box, respectively. Homologs between SfI and other phages/prophages are shown in different colors, with color coding corresponding to the
level of homology at the amino acid level: red indicates 100% identity and blue >=50% identity.

Phage SfI has a very narrow host range

Host specificity of serotype-converting bacteriophages has
long been recognized; it results in the specific lytic
spectrum and serotype conversion of S. flexneri in nature
[20]. Recognition between the O-antigen of the host
bacterium and the tail component of a phage is the key
mechanism of host specificity [20]. To determine the host
range of SfI, 132 S. flexneri strains of 12 serotypes (1a, 1b,
2a, 2b, 3a, 3b, 4a, 4b, 5a, Y, X and Xv) were tested
as described in Methods. In addition to the
10 serotype Y strains, which were all converted to
serotype 1a as expected, the 24 serotype X strains tested
were also lysogenized and converted to the newly named
serotype 1d [16]. The serotype 1d strains were serologically
characterized as reacting with both the serotype 1-specific
I typing sera and the serotype X-specific 7;8 grouping sera
[16]. Interestingly, such a serotype has already appeared in
natural infections in Anhui and Henan provinces, China
[21]. Except for serotypes Y and X, none of the other serotypes
could be lysogenized by phage SfI. A possible explanation
for the host range restriction of phage SfI is phage
immunity due to modification of the O-antigen, which serves as the phage
receptor [22].
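
The conversion pattern reported above can be summarised as a small lookup. The following is only an illustrative sketch: the group antigens (3;4 for serotype Y, 7;8 for serotype X), the resulting serotypes and the strain counts are taken from the text and Methods, while the function and variable names are hypothetical and not part of the original study.

```python
# Group antigens carried by the two susceptible serotypes (from the text).
GROUP_ANTIGENS = {"Y": "3;4", "X": "7;8"}

# Serotype 1 subtype obtained when type antigen I is added to each group antigen.
TYPE_I_SUBTYPES = {"3;4": "1a", "7;8": "1d"}

# Number of strains tested per serotype (from Methods).
STRAINS_TESTED = {
    "1a": 17, "1b": 5, "2a": 10, "2b": 10, "3a": 10, "3b": 2,
    "4a": 5, "4b": 5, "5a": 4, "Y": 10, "X": 24, "Xv": 30,
}


def convert_with_sfi(serotype):
    """Return the serotype after SfI lysogeny, or None if not lysogenized."""
    group = GROUP_ANTIGENS.get(serotype)
    if group is None:
        return None  # serotype outside the reported host range
    return TYPE_I_SUBTYPES[group]


total_strains = sum(STRAINS_TESTED.values())
converted_strains = sum(
    n for serotype, n in STRAINS_TESTED.items() if convert_with_sfi(serotype)
)
print(total_strains, converted_strains)
```

Under these assumptions, only the serotype Y and X strains are counted as converted, which matches the narrow host range described in the text.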

SfI uses a site-specific mechanism for DNA packaging and
has the same attP core sequence as SfII, SfIV, SfV and SfX

Restriction enzyme analysis revealed that phage SfI has a
linear rather than circular genome (data not shown). Genomic
comparison showed that the SfI prophage genome has
packaging genes similar to those of phage SfV, and the
fragments adjacent to them were also highly similar to the
cohesive end site (cos) of phage SfV [9], with only one base
difference at the 5' end (T versus A). These data suggest
that SfI may use the same site-specific mechanism as SfV
for packaging. Direct sequencing of the putative termini
of the SfI genome extracted from free phage particles, and
comparison of the corresponding regions with the SfI
prophage genome in strain 019, revealed a 10-nucleotide
(5'-TGCCCGCCCC-3') gap in the SfI phage genome. Therefore,
we conclude that SfI uses a cos mechanism for DNA
packaging, as postulated for phage SfV [9], and does not
use a headful mechanism (pac) as phages Sf6 and SfX do
[10,12].
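
The comparison between the phage termini and the prophage sequence can be sketched as follows. This is a minimal illustration, not the procedure used in the study: only the 10-nucleotide sequence is taken from the text, whereas the flanking sequences and the function name are hypothetical placeholders.

```python
COS_GAP = "TGCCCGCCCC"  # 10-nt gap reported in the text


def find_gap(prophage_region, right_end, left_end):
    """Return the prophage bases lying between the sequenced right and left
    phage termini, or None if either terminus is not found."""
    i = prophage_region.find(right_end)
    if i < 0:
        return None
    start = i + len(right_end)
    j = prophage_region.find(left_end, start)
    if j < 0:
        return None
    return prophage_region[start:j]


# Hypothetical flanking sequences, for illustration only.
right_flank = "ACGTTAGGCA"
left_flank = "GGATCCTTAA"
prophage = right_flank + COS_GAP + left_flank

gap = find_gap(prophage, right_flank, left_flank)
print(gap, len(gap))
```

Bases that are present in the contiguous prophage sequence but absent between the two sequenced ends of the packaged genome are the candidates for the single-stranded cohesive end.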

Integration of lambdoid phages into the bacterial
chromosome generally occurs by site-specific recombination
between the phage attP and the bacterial attB sites
[23]. In all serotype-converting phages except for Sf6,
the attP site is always located immediately downstream
of the O-antigen modification genes, and
preceded by the int and xis genes [6]. To determine the
attP site of phage SfI, the region between genes gtrA and
intI of SfI was PCR amplified and sequenced, and a
261 bp sequence was obtained, in which 46 bases,
ATTCGTAATGCGAAGGTCGTAGGTTCGACTCCTAT
TATCGGCACCA, were found to be identical to the
attR/attL core sequence of prophage SfI in strain Y53
[5] (Figure 3). In the lysogen 036_1a, the 261-nucleotide
sequence was divided into two parts, located at
opposite ends of the SfI prophage genome (Figure 3).
Evidently, site-specific recombination occurred at this attP
site. The attP core sequence of SfI is identical to that of the S.
flexneri serotype-converting phages SfII, SfV and SfX, as
well as that of the serotype-converting phages P22 of Salmonella
typhimurium and DLP12 of E. coli [5,8,24].
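
As an illustration of how the 46-base core can be located in a sequenced junction region, the sketch below searches for it in a short fragment transcribed from the attP region reported in Figure 3. The core sequence is taken from the text; the function name is hypothetical, and a plain exact-match search is an assumption that would not tolerate mismatches.

```python
# 46-base att core sequence reported in the text.
ATT_CORE = "ATTCGTAATGCGAAGGTCGTAGGTTCGACTCCTATTATCGGCACCA"


def find_core(sequence, core=ATT_CORE):
    """Return the 0-based index of the core in the sequence, or -1 if absent."""
    return sequence.upper().find(core)


# Fragment spanning the core, transcribed from the attP region (Figure 3).
attp_fragment = (
    "agcggattcttata"
    "attcgtaatgcgaaggtcgtaggttcgactcctattatcggcacca"
    "ttaaaatca"
)

core_start = find_core(attp_fragment)
print(len(ATT_CORE), core_start)
```

The same search applied to the attL and attR junctions of a lysogen would be expected to find one copy of the core at each end of the prophage.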

Characterization of SfI genome sequence

The complete genome sequence of SfI was obtained by
combining the SfI prophage genome of host strain 019
with the attP site obtained by PCR sequencing as above.
First, the whole genome of host strain 019 was
sequenced using Illumina Solexa sequencing. A total of
4,382,674 reads were generated to reach about 110-fold
coverage and assembled de novo into 376 contigs and
scaffolds. The SfI prophage genome located between genes
int and gtrIA was extracted from one of the contigs and
further assembled with the attP site sequence obtained
above to construct a circular phage SfI genome. Following
usual practice, we present the genome in linear form, having
artificially linearised the sequence so that it starts at the terminase small
subunit gene and ends at the cos site (Figure 2).
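
Linearising a circular assembly at a chosen start position amounts to rotating the sequence. The sketch below shows the operation on a toy string; the function name and the example sequence are hypothetical, and the actual start coordinate used for SfI is not given here.

```python
def linearise(circular_seq, start):
    """Rotate a circular sequence so that 0-based index `start` becomes
    the first base of the linear representation."""
    if not circular_seq:
        return ""
    start %= len(circular_seq)
    return circular_seq[start:] + circular_seq[:start]


# Toy example: treat index 3 as the first base of the chosen start gene.
toy_genome = "GGGAAATTTCCC"
print(linearise(toy_genome, 3))
```

Rotation changes only the coordinate origin, so the length and base composition of the genome are unaffected.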

The genome size of SfI is 38,389 bp, similar to that of the
sequenced S. flexneri serotype-converting phages Sf6
(39,043 bp) [10], SfV (37,074 bp) [9] and SfX (37,355 bp)
(unpublished data). The overall G + C content is 50.12%,
which is very similar to that of its host (50.9%) [25].
Sixty-six putative ORFs (including one pseudogene) were
predicted, and their functions are listed in Additional
file 1: Table S1.
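
For reference, G + C content is the percentage of G and C among the unambiguous bases of a sequence. A minimal sketch follows; the function name is hypothetical, and ignoring ambiguous bases such as N is an assumption.

```python
def gc_content(seq):
    """Return the G + C percentage of a DNA sequence, ignoring non-ACGT symbols."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / acgt if acgt else 0.0


print(gc_content("GGCCAATT"))
```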

The genetic architecture of the SfI genome is similar to
that of the sequenced S. flexneri serotype-converting phages
SfV, Sf6 and SfX: the left-most region encodes genes for
phage packaging and structure, followed by the middle
region with genes involved in serotype conversion,
integration/excision, recombination, immunity and regulation,
replication and the Nin region, and then the putative lysis
cassette at the right-most region, ending with the cos site
of the phage genome (Figure 2). The genomic structure of
SfI is also similar to that of phages SfV and lambda. Thus it
belongs to the family of lambdoid phages.

Figure 3 DNA sequences of the chromosomal integration site of S. flexneri phage SfI. Sequences were obtained by PCR and sequencing of junction
regions using a series of primers across the integration site. (A) attP in phage SfI. (B) attB in strain 036. (C) attL in strain 036_1a. (D) attR in 036_1a.
Sequences in boxes are DNA regions between conserved genes; underlined sequences are tRNA-thrW; sequences in blue are the att core sequence;
conserved genes are shaded and their transcription orientation is marked by an arrow.

tRNAscan was used to find tRNA genes. Two tRNA
genes in tandem, with anticodons GUU for asparagine
(Asn) and UGU for threonine (Thr), were found
downstream of gene Q (35,738 - 35,809 for Asn,
and 35,818 - 35,890 for Thr). One or both of these tRNA
genes are also found at this position in
phages Sf6, ST64T, PS3 and p21 [10,26,27]. A recent study
suggested that phage-encoded tRNA could serve to
supplement the host tRNA reservoir, allowing the rare codons
in the phage to be decoded more efficiently [28]. Codon
analysis indeed found a clear bias towards ACA
(anticodon UGU) in the SfI genome compared with its S.
flexneri host (17.3% in phage SfI versus 7.1% in strain
Sf301), but no obvious bias was observed for CAA
(anticodon GUU), and the significance of the tRNA-Asn in SfI
is not clear.
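
A codon-bias comparison of this kind can be sketched as below. The text does not state how the percentages were normalised, so the measure used here, the share of ACA among the four threonine codons, is an assumption, and the function names and toy coding sequence are hypothetical.

```python
from collections import Counter

THR_CODONS = ("ACA", "ACC", "ACG", "ACT")


def codon_counts(cds):
    """Count in-frame codons of a coding sequence, dropping a trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def synonymous_share(cds_list, codon, family):
    """Percentage of `codon` among all codons of its synonymous family."""
    counts = Counter()
    for cds in cds_list:
        counts.update(codon_counts(cds))
    total = sum(counts[c] for c in family)
    return 100.0 * counts[codon] / total if total else 0.0


# Hypothetical toy coding sequence: ATG ACA ACC ACA TAA
toy_cds = ["ATGACAACCACATAA"]
aca_share = synonymous_share(toy_cds, "ACA", THR_CODONS)
print(round(aca_share, 1))
```

Running the same function on the phage ORFs and on the host coding sequences would give two values that can be compared directly, provided both are normalised in the same way.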

Genomic comparison reveals that SfI is genetically related
to Shigella phage SfV, E. coli prophage e14 and lambda

The ORFs encoded in the SfI genome were searched
against the GenBank database at both the DNA and amino
acid levels. SfI-encoded proteins exhibited homology to
phages and prophages originating from various
hosts, including Shigella (SfV, Sf6 and SfX), E. coli (lambda,
phiP27, VT1-Sakai, BP-4795, 933W, 1717, 2851, Stx1,
Stx2, VT2-Sa, YYZ-2008, 86, M27 and e14) and Salmonella
(ST64B, p22-pbi, SE1, ST104, ST64T and epsilon34).
Figure 2 displays the homologies of phage SfI to other
phages. The SfI genes involved in phage packaging and
morphogenesis are homologous to, and organized in a similar
manner to, those of phages SfV, phiP27 and ST64B and
prophage e14. As reported earlier [6], the O-antigen modification
and integration and excision modules (gtrA, gtrB, int
and xis) are homologous to those of serotype-converting
bacteriophages from S. flexneri (SfV and SfX) and Salmonella
(p22-pbi, SE1, ST104, ST64T and epsilon34). However,
the early and regulatory regions located in the right half of
the genome are homologous to those of lambda and the Shiga
toxin-1 and Shiga toxin-2 phages (phiP27, VT1-Sakai,
BP-4795, 933W, 1717, 2851, Stx1, Stx2, VT2-Sa, YYZ-2008,
86 and M27).

Therefore SfI is a mosaic phage, with its left half most
homologous to phage SfV (91.6% - 100% identity at the protein
level, and 89 - 98% at the DNA level [ORF by ORF
comparison]) and E. coli prophage e14 (94.0% - 100% identity
at the protein level, and 97% at the DNA level) and its right half most
homologous to lambda (67% - 100% identity at the protein
level, and 80 - 98% at the DNA level). Homology to SfV
encompasses at least 23 ORFs encoding functions for
morphology (orf1, orf2 and orf9 - orf24), O-antigen
modification (orf26, orf27) and integration/excision (orf28 to orf29)
(Figure 2, Table 1). The homologous ORFs are located in
four contiguous regions, amounting to 17,487 bp
and accounting for 45.6% of the entire phage
genome (Table 1). SfI also shares genetic relatedness with
the E. coli prophage e14. The homologous regions mainly
encode proteins responsible for phage assembly and
morphogenesis and are located in the left half of the SfI
genome (Figure 2 and Table 1). The homologous regions
account for 46% of the SfI genome. Based on the
homology of the first 22 ORFs (Additional file 2: Figure S1), it
seems that SfI is closer to e14 than to SfV, since 5 ORFs
(SfI orf1 to orf7) are highly homologous between SfI and
e14, but share little homology between SfI and SfV. For
the remaining 17 ORFs except orf8, the pairwise percentage
identities are very similar between SfI, SfV and e14.
On the other hand, the homology between SfI and SfV
extends further to orf28, with high homology of orf23,
orf24 and orf26 to orf28. Similarly, six contiguous DNA
segments, which account for 28.4% of the SfI genome,
were found to be homologous to the corresponding
regions of lambda. These homologous regions are mainly
located in the early and regulatory regions, and encode
functional modules for phage recombination (orf35 to
orf43), immunity and regulation (orf45 to orf50),
replication (orf51, orf52), the Nin region (orf53 to orf55, orf57 to
orf60), and part of the lysis module (orf64) (Figure 2 and
Table 1). Thus a total of 72.9% of the SfI genome is
homologous to either SfV, e14 or lambda.
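
The share of the genome covered by homologous segments follows directly from the segment coordinates. The sketch below reproduces the SfV figure using the four SfI coordinate ranges listed in Table 1 and the genome length of 38,389 bp; the function and variable names are hypothetical.

```python
GENOME_LENGTH = 38_389  # SfI genome size in bp (from the text)

# SfI coordinates of the four regions homologous to SfV (Table 1), 1-based inclusive.
SFV_SEGMENTS = [(2, 2_194), (6_053, 18_042), (19_787, 22_681), (37_964, 38_372)]


def segment_length(start, end):
    """Length in bp of a 1-based, inclusive coordinate range."""
    return end - start + 1


def percent_covered(segments, genome_length):
    """Return (total bp covered, percentage of genome) for non-overlapping segments."""
    total = sum(segment_length(s, e) for s, e in segments)
    return total, 100.0 * total / genome_length


total_bp, pct = percent_covered(SFV_SEGMENTS, GENOME_LENGTH)
print(total_bp, round(pct, 1))
```

The calculation assumes that the listed segments do not overlap; overlapping ranges would need to be merged first to avoid double counting.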

Conclusions 

The serotype-converting bacteriophage SfI was isolated
from an S. flexneri serotype 1a strain. It had a narrow lytic
pattern and converted only serotype Y to serotype 1a and
serotype X to serotype 1d. Morphologically, SfI is a member
of the Myoviridae family in the order Caudovirales.
Genomic analysis revealed that SfI contains features of
lambdoid phages and is closely related to S. flexneri phage SfV, E.
coli prophage e14 and lambda. The characterization of
serotype-converting phage SfI enhances our understanding
of serotype conversion of S. flexneri.

Methods 

Bacterial strains, media and culture 

S. flexneri serotype 1a strain 019 [16] was used as the
source for induction of phage SfI. S. flexneri strain 036
(serotype Y) was used as the host for phage infection and
large-volume propagation of SfI [16]. One hundred and
thirty-two S. flexneri strains of 12 serotypes (17 serotype
1a, 5 serotype 1b, 10 serotype 2a, 10 serotype 2b, 10 serotype
3a, 2 serotype 3b, 5 serotype 4a, 5 serotype 4b, 4 serotype
5a, 10 serotype Y, 24 serotype X and 30 serotype Xv)
were used for phage host range detection. All S. flexneri
strains used in this study were isolated from diarrheal
patients in China, or purchased from the National Collection
of Type Cultures (NCTC), UK. S. flexneri strains
were serologically identified using Shigella antisera kits
(Denka Seiken, Japan) and monoclonal antibody reagents
(Reagensia AB, Sweden). S. flexneri strains were routinely
cultured on LB agar or in LB broth with shaking at 37°C.

Induction of phage SfI

Induction of phage SfI was performed as described
by Mavris et al. [8]. Briefly, a freshly grown
colony of strain 019 was incubated in 10 ml LB broth
overnight with vigorous shaking. After being induced
for 30 min at 56°C with aeration, the cultures were
centrifuged, and the supernatants were filtered through a
0.22 μm membrane filter (Promega) to remove bacterial
cells. The filtrates were either used directly for the phage
infection assay or stored at 4°C with the addition of 10% (v/v)
chloroform.

Table 1 Homology of SfI to S. flexneri phage SfV, E. coli prophage e14 and lambda

Phage or   Nucleotide position        Homologous nucleotide position         % identity at      SfI ORFs (a)                  % of SfI
prophage                              in SfI (total length [bp])             nucleotide level                                 genome
---------  -------------------------  -------------------------------------  -----------------  ----------------------------  --------
SfV        9 - 2,211                  2 - 2,194 (2,193)                      98                 orf1, (orf2)                  45.6
           5,793 - 17,782             6,053 - 18,042 (11,990)                97                 orf9 - orf24
           19,146 - 22,042            19,787 - 22,681 (2,895)                98                 (orf26), orf27 - orf29, attP
           36,666 - 37,074            37,964 - 38,372 (409)                  89                 (orf66)
Lambda     30,418 - 30,910            23,002 - 23,493 (491)                  95                 (orf31), orf32, (orf33)       28.4
           31,206 - 34,381            24,281 - 27,456 (3,176)                98                 (orf35), orf36 - orf43
           35,104 - 35,386            27,708 - 27,990 (283)                  98                 (orf45)
           35,496 - 41,084            28,052 - 33,640 (5,590)                98                 orf46 - orf55
           42,097 - 43,068            2 - 2,194 (2,193)                      97                 orf57 - orf59, (orf60)
           45,966 - 46,361            6,053 - 18,042 (11,990)                80                 (orf64)
e14        2,840,259 - 2,859,298 (b)  1 - 17,234, 36,721 - 38,389 (17,660)   97                 orf1 - orf22, (orf66)         46

(a) Parentheses indicate that the region of homology starts or ends within an ORF.
(b) E. coli S88 strain genome (accession no. CU928161).

Phage infection and lysogenization 

S. flexneri strain 036 cells were prepared using the
methods for phage lambda [29]. Phage infection and
lysogenization were performed using the methods
described previously [16]. The serotypes of isolated
colonies were identified by slide agglutination assay.
Large-volume phage purification was performed on S. flexneri
strain 036, according to the methods for phage SfII [8].

Electron microscopy 

The purified phages were adsorbed onto carbon-coated
copper grids (300 mesh) and negatively stained with 2%
(w/v) sodium phosphotungstate (pH 7.0). Samples were
visualized with a Hitachi 600 electron microscope at 80 kV.



Host range detection

To determine the host range of phage SfI, one hundred
and thirty-two S. flexneri strains of 12 serotypes were
infected with SfI. The preparation of competent cells,
phage infection and lysogen isolation were performed as
described for strain 036 above. The SfI host range was
determined by observing the presence of plaques and
by serological identification of the lysogens.

Identification of the chromosomal integration site and
cohesive ends (cos sites) of phage SfI

Oligonucleotide primers gtrI-F (5'- ATTGAACGCCTCCTTGCTATGC -3'),
intI-R (5'- AGTGTTACAGGAAATGGGAGGC -3'),
proA-F (5'- ACAAAGCGAAATCATCCTCAA -3') and
yaiC-R (5'- GCAGGAAACCACCATCAACACC -3'), which are complementary to the
genes gtrI and intI in phage SfI, and proA and yaiC in the S.
flexneri chromosome, respectively, were used to identify
the attP and attB sites of phage SfI and strain 036, as
well as the attR and attL regions of the SfI lysogen. PCR
was conducted using the Sensoquest labcycler PCR
System (SENSO, Germany) under a standard protocol. The
PCR products were either cloned into the TA vector
pMD20-T (TaKaRa) for sequencing or sequenced directly.

To determine the cohesive ends of the SfI phage, two
primers, cos-F: 5'- ATGCCACCACGAACCCCAAAAG -3'
(nt 37,964 - 37,985, complementary to the SfI genome
sequence) and cos-R: 5'- GGCTTGGGGCGACGCCCGGA -3'
(nt 72 - 91, complementary to the SfI genome), were designed
to sequence the putative termini of the SfI genome directly
using phage DNA as the template. The phage genome ends
obtained were then compared to the corresponding
regions of the SfI prophage genome in strain 019. The
region missing from the former sequence is the putative cos
site of phage SfI.



Genome sequencing and analysis 

To obtain the entire phage genome sequence of SfI, the
whole genome of source strain 019 was sequenced by
Illumina Solexa sequencing. A paired-end (PE) library with
an average insertion length of between 500 bp and
2,000 bp was constructed. Reads were generated with an
Illumina Solexa GA IIx (Illumina, San Diego, CA) and
assembled into scaffolds using SOAPdenovo (Release 1.04).
The sequence between genes intI and gtrA was extracted
for further analysis. By assembling it with the sequence
amplified from SfI DNA using the primer pair gtrI-F and intI-R
mentioned above, the entire sequence of the SfI genome in its
circular state was obtained. Open reading frames (ORFs) of
SfI were determined using the ORF Finder program, which
is accessible through the National Center for Biotechnology
Information (NCBI). Searches for homologous DNA
and protein sequences were conducted with the BLAST
software against the non-redundant GenBank database
(http://www.ncbi.nlm.nih.gov/blast/blast/). tRNA genes
were identified with the tRNAscan-SE Search server
(http://lowelab.ucsc.edu/tRNAscan-SE).

Sun et al. BMC Microbiology 2013, 13:39
http://www.biomedcentral.com/1471-2180/13/39

Nucleotide accession number 

The genomic sequence of phage SfI has been deposited
in GenBank under accession number JX509734.